Abstract

Scalar quantization is the most practical and straightforward approach to signal quantization. However, 
it has been shown that scalar quantization of oversampled or Compressively Sensed signals can be 
inefficient in terms of the rate-distortion trade-off, especially as the oversampling rate or the sparsity of 
the signal increases. In this paper, we modify the scalar quantizer to have discontinuous quantization 
regions. We demonstrate that with this modification it is possible to achieve exponential decay of the 
quantization error as a function of the oversampling rate instead of the quadratic decay exhibited by 
current approaches. Our approach is universal in the sense that prior knowledge of the signal model is 
not necessary in the quantizer design, only in the reconstruction. Thus, we demonstrate that it is possible 
to reduce the quantization error by incorporating side information on the acquired signal, such as sparse 
signal models or signal similarity with known signals. In doing so, we establish a relationship between 
quantization performance and the Kolmogorov entropy of the signal model. 

Index Terms 

scalar quantization, randomization, randomized embedding, oversampling, robustness 

I. Introduction 

In order to digitize a signal, two discretization steps are necessary: sampling (or measurement) and 
quantization. The first step, sampling, computes linear functions of the signal, such as the signal's 
instantaneous value or the signal's inner product with a measurement vector. The second step, quantization, 
maps the continuous-valued measurements of the signal to a set of discrete values, usually referred to 
as quantization points. Overall, these two discretization steps do not preserve all the information in the 
analog signal. 

The sampling step of the discretization can be designed to preserve all the information in the signal. 
Several sampling results demonstrate that as long as sufficiently many samples are obtained given the 
class of the signal sampled it is possible to exactly recover a signal from its samples. The most celebrated 
sampling result is the Nyquist sampling theorem which dictates that uniform sampling at a frequency
at least twice the bandwidth of a signal is sufficient to recover the signal using simple bandlimited
interpolation. More recently, Compressive Sensing theory has demonstrated that it is also possible to 
recover a sparse signal from samples approximately at its sparsity rate, rather than its Nyquist rate or 
the rate implied by the dimension of the signal. 

Unfortunately, the quantization step of the process, almost by definition, cannot preserve all the 
information. The analog measurement values are mapped to a discrete number of quantization points. 
By the pigeonhole principle, it is impossible to represent an infinite number of signals using a discrete 
number of values. Thus, the goal of quantizer design is to exploit those values as efficiently as possible 
to reduce the distortion on the signal. 

One of the most popular methods for quantization is scalar quantization. A scalar quantizer treats and 
quantizes each of the signal measurements independently. This approach is particularly appealing for its 
simplicity and its relatively good performance. However, present approaches to scalar quantization do 
not scale very well with the number of measurements [1]-[4]. Specifically, if the signal is oversampled,
the redundancy of the samples is not exploited effectively by the scalar quantizer. The trade-off between 
the number of bits used to represent an oversampled signal and the error in the representation does not 
scale well as oversampling increases. In terms of the rate vs. distortion trade-off, it is significantly more 
efficient to allocate representation bits such that they produce refined scalar quantization with a critically 
sampled representation as opposed to coarse scalar quantization with an oversampled representation. 

This trade-off can be reduced or eliminated using more sophisticated or adaptive techniques such 
as vector quantization, Sigma-Delta ($\Sigma\Delta$) quantization [5]-[7], or coding of level crossings [8]. These
methods consider more than one sample in forming a quantized representation, either using feedback 
during the quantization process or by grouping and quantizing several samples together. These approaches 
improve the rate vs. distortion trade-off significantly. The drawback is that each of the measurements 
cannot be quantized independently, and they are not appropriate when independent quantization of the 
coefficients is necessary. 

In this work we develop the basis for a measurement and scalar quantization framework that
significantly improves the rate-distortion trade-off without requiring feedback or grouping of the
coefficients. Each measured coefficient is independently quantized using a modified scalar quantizer with
non-contiguous quantization intervals. Using this modified quantizer we show that we can beat existing
lower bounds on the performance of oversampled scalar quantization, which only consider quantizers
with contiguous quantization intervals [2], [9].

The framework we present is universal in the sense that information about the signal or the signal model
is not necessary in the design of the quantizer. In many ways, the quantization method is reminiscent
of information theoretic distributed coding results, such as the celebrated Slepian-Wolf and Wyner-Ziv
coding methods [10], [11]. While we only analyze 1-bit scalar quantization, we discuss how the results
can be easily extended to multibit scalar quantization.

One of the key results we derive in this paper is the exponential quantization error decay as a function 
of the oversampling rate. To the best of our knowledge, it is the first example of a scalar quantization 
scheme that achieves exponential error decay without further coding or examination of the quantized 
samples. Thus, our method is truly distributed in the sense that quantization and transmission of each 
measurement can be performed independently of the others. 

Our result is similar in flavor to recent results in Compressive Sensing, such as the Restricted Isometry
Property (RIP) of random matrices [12]-[15]. Specifically, all our proofs are probabilistic and the results
hold with overwhelming probability on the system parameters. The advantage of our approach is that we
do not impose a probabilistic model on the acquired signal. Instead, the probabilistic model is on the 
acquisition system, the properties of which are usually under the control of the system designer. 

The proof approach is inspired by the proof of the RIP of random matrices in [15]. As in that proof,
we examine how the system performs in distinguishing pairs of signals as a function of their distance. We
then extend the result to distinguishing a small ball around each of the signals in the pair. By covering
the set of signals of interest with such balls we can extend the result to the whole set. The number of
balls required to cover the set and, by extension, the Kolmogorov entropy of the set play a significant
role in the reconstruction performance. While Kolmogorov entropy is known to be intimately related to
the rate-distortion performance under vector quantization, this is the first time it is tied to the rate-distortion
performance under scalar quantization.

We assume a consistent reconstruction algorithm, i.e., an algorithm that reconstructs a signal estimate 
that quantizes to the same quantization values as the acquired signal [3]. However, we do not discuss any
practical reconstruction algorithms in this paper. For any consistent reconstruction algorithm it suffices 
to demonstrate that if the reconstructed signal is consistent with the measurements, it cannot be very 
different from the acquired signal. To do so, we need to examine all the signals in the space we are 
interested in. Exploiting and implementing these results with practical reconstruction algorithms is a topic 
for future publications. 

In the next section, which partly serves as a brief tutorial, we provide an overview of the state of the
art in scalar quantization. In this overview we examine in detail the fundamental limitations of current
scalar quantization approaches and the reasons behind them. This analysis suggests one way around
the limitations, which we examine in Sec. III. In Sec. IV we discuss the universality properties of our
approach and we examine how side-information on the signal can be incorporated in our framework to
improve quantization performance. In this spirit, we examine Compressive Sensing and quantization of
similar signals. Finally, we discuss our results and conclude in Sec. V.

II. Overview of Scalar Quantization 

A. Scalar Quantizer Operation 

A scalar quantizer operates directly on individual scalar signal measurements without taking into
account any information on the value or the quantization level of nearby measurements. Specifically, the
generation of the $m$-th quantized measurement from the signal $x \in \mathbb{R}^K$ is performed using

$y_m = \langle x, \phi_m \rangle + w_m$, (1)

$q_m = Q\left(y_m / \Delta_m\right)$, (2)

where $\phi_m$ is the measurement vector and $w_m$ is the additive dither used to produce a dithered scalar
measurement $y_m$, which is subsequently scaled by a precision parameter $\Delta_m$ and quantized by the
quantization function $Q(\cdot)$. The index $m = 1, \ldots, M$, where $M$ is the total number of quantized
coefficients acquired. The precision parameter is usually not explicit in the literature but is incorporated as
a design parameter of the quantization function $Q(\cdot)$. We made it explicit in this overview in anticipation
of our development.
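As a minimal sketch of (1) and (2), the following Python snippet quantizes a single dithered measurement. The choice $Q(\cdot) = \lfloor \cdot \rfloor$ with an unbounded number of levels, the parameter values, and the function names `measure` and `quantize_uniform` are assumptions made only for illustration; the text above does not fix a particular $Q(\cdot)$.

```python
import math
import random

def measure(x, phi, w):
    """Dithered measurement y_m = <x, phi_m> + w_m, as in (1)."""
    return sum(xi * pi for xi, pi in zip(x, phi)) + w

def quantize_uniform(y, delta):
    """Illustrative uniform quantizer for (2): scale by delta, then floor.

    Assumption: Q(.) = floor(.), with no saturation (unbounded levels).
    """
    return math.floor(y / delta)

rng = random.Random(0)
K, delta = 4, 0.25
x = [rng.uniform(-1.0, 1.0) for _ in range(K)]    # signal
phi = [rng.gauss(0.0, 1.0) for _ in range(K)]     # measurement vector
w = rng.uniform(0.0, delta)                       # dither, known at reconstruction
y = measure(x, phi, w)
q = quantize_uniform(y, delta)

# By construction, the measurement lies in the cell indexed by q.
assert q * delta <= y < (q + 1) * delta
```

A larger `delta` maps a wider range of measurements to the same index `q`, which is the loss of precision discussed in the text.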

The measurement vectors can vary, depending on the problem at hand. Typically they form a basis
or an overcomplete frame for the space in which the signal of interest lies [3], [9], [16]. More recently,
Compressive Sensing demonstrated that it is possible to undersample sparse signals and still be able to
recover them using incoherent measurement vectors, often randomly generated [13], [17]-[20]. Random
dither is sometimes added to the measurements to reduce certain quantization artifacts and to ensure the
quantization error has tractable statistical properties. The dither is usually assumed to be known and is
taken into account in the reconstruction. If dither is not used, $w_m = 0$ for all $m$.

The quantization function $Q(\cdot)$ is typically a uniform quantizer, such as the one shown in Fig. 1(a) for
a multi-bit quantizer or in Fig. 1(b) for a binary (1-bit) quantizer. The number of bits required depends
on the number of quantization levels used by the quantizer. For example, Fig. 1(a) depicts an 8-level, i.e.,
a $\log_2(8) = 3$-bit quantizer. The number of levels necessary, in turn, depends on the dynamic range of
the scaled measurements, i.e., the maximum and minimum possible values, such that the quantizer does
not overflow significantly. A $B$-bit quantizer can represent $2^B$ quantization values, which determines
the trade-off between accuracy and bit-rate.

The scaling performed by the precision parameter $\Delta_m$ controls the trade-off between quantization
accuracy and the number of quantization bits. Larger $\Delta_m$ will cause a larger range of measurement
values to quantize to the same quantization level, thus increasing the ambiguity and decreasing the
precision of the quantizer. Smaller values, on the other hand, increase the precision of the quantizer but
produce a larger dynamic range of values to be quantized. Thus more quantization levels and, therefore,
more bits are necessary to avoid saturation. Often non-uniform quantizers may improve the quantization
performance if there is prior knowledge about the distribution of the measurements. These can be designed
heuristically, or using a design method such as the Lloyd-Max algorithm [21], [22]. Recent work has
also demonstrated that overflow, if properly managed, can in certain cases be desirable and effective in
reducing the error due to quantization [23], [24]. Even with these approaches, the fundamental accuracy
vs. distortion trade-off remains in some form.

A more compact, vectorized form of (1) and (2) will often be more convenient in our discussion:

$y = \Phi x + w$, (3)

$q = Q\left(\Delta^{-1} y\right)$, (4)

where $y$, $w$, and $q$ are vectors containing the measurements, the dither coefficients, and the quantized
values, respectively, $\Delta$ is a diagonal matrix with the precision parameters $\Delta_m$ on its diagonal, $Q(\cdot)$ is the
scalar quantization function applied element-by-element to its input, and $\Phi$ is the $M \times K$ measurement matrix
that contains the measurement vectors $\phi_m$ in its rows.

B. Reconstruction from Quantized Measurements 

A reconstruction algorithm, denoted $R(\cdot)$, uses the quantized representation generated by the signal to
produce a signal estimate $\hat{x} = R(q)$. The performance of the quantizer and the reconstruction algorithm is
measured in terms of the reconstruction distortion, typically measured using the $\ell_2$ distance: $d = \|x - \hat{x}\|_2$.
The goal of the quantizer and the reconstruction algorithm is to minimize the average or the worst case
distortion given a probabilistic or a deterministic model of the acquired signals.

The simplest reconstruction approach is to substitute the quantized value in standard reconstruction
approaches for unquantized measurements. For example, if $\Phi$ forms a basis or a frame, we can use linear
reconstruction to compute

$\hat{x} = \Phi^{\dagger} (\Delta q - w)$,

where $(\cdot)^{\dagger}$ denotes the pseudoinverse (which is equal to the inverse if $\Phi$ is a basis). Linear reconstruction
using the quantized values can be shown to be the optimal reconstruction method if $\Phi$ is a basis. However,
it is suboptimal in most other cases, e.g., if $\Phi$ is an oversampled frame, or if Compressive Sensing
reconstruction algorithms are used [2], [3], [25].
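The linear reconstruction formula can be illustrated with a small numerical sketch. The example below is only a sketch under stated assumptions: it uses $K = 2$, a frame of $M$ unit vectors at equally spaced angles, a round-to-nearest uniform quantizer, and a hand-written pseudoinverse via the normal equations. None of these choices, nor the function names, come from the text.

```python
import math
import random

def quantize_round(y, delta):
    """Assumed uniform quantizer: nearest integer to y / delta."""
    return math.floor(y / delta + 0.5)

def pinv_apply(Phi, v):
    """Apply the pseudoinverse of a full-column-rank M x 2 matrix Phi to v,
    using the normal equations (Phi^T Phi)^{-1} Phi^T v."""
    a = sum(r[0] * r[0] for r in Phi)
    b = sum(r[0] * r[1] for r in Phi)
    c = sum(r[1] * r[1] for r in Phi)
    g0 = sum(r[0] * vi for r, vi in zip(Phi, v))
    g1 = sum(r[1] * vi for r, vi in zip(Phi, v))
    det = a * c - b * b
    return [(c * g0 - b * g1) / det, (a * g1 - b * g0) / det]

def linear_reconstruct(Phi, q, w, delta):
    """x_hat = pinv(Phi) (delta * q - w), the linear reconstruction in the text."""
    v = [delta * qi - wi for qi, wi in zip(q, w)]
    return pinv_apply(Phi, v)

def demo(M=8, delta=0.01, seed=0):
    rng = random.Random(seed)
    # Assumed frame: M unit vectors at equally spaced angles in [0, pi).
    Phi = [(math.cos(math.pi * m / M), math.sin(math.pi * m / M)) for m in range(M)]
    x = (0.3, -0.7)
    w = [rng.uniform(0.0, delta) for _ in range(M)]
    q = [quantize_round(p[0] * x[0] + p[1] * x[1] + wi, delta)
         for p, wi in zip(Phi, w)]
    x_hat = linear_reconstruct(Phi, q, w, delta)
    return math.hypot(x[0] - x_hat[0], x[1] - x_hat[1])

print("reconstruction error:", demo())
```

With this quantizer each entry of $\Delta q - w$ differs from the unquantized measurement by at most $\Delta/2$, so the reconstruction error shrinks with $\Delta$. The sketch does not enforce consistency with the quantized values.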

A better approach is to use consistent reconstruction, a reconstruction method that enforces that the
reconstructed signal quantizes to the same value, i.e., satisfies the constraint $q = Q\left(\Delta^{-1}(\Phi\hat{x} + w)\right)$.
Consistent reconstruction was originally proposed for oversampled frames in [3], where it was shown to
outperform linear reconstruction. Subsequently consistent reconstruction, or approximations of it, have
been shown in various scenarios to improve Compressive Sensing or other reconstruction from quantized
measurements [23], [24], [26]-[33]. It is also straightforward to demonstrate that if $\Phi$ is a basis, the
simple linear reconstruction described above is also consistent.



C. Reconstruction Rate and Distortion Performance 

The performance of scalar quantizers is typically measured by their rate vs. distortion trade-off, i.e.,
how increasing the number of bits used by the quantizer affects the distortion on the measured signal
due to quantization. The distortion can be measured as worst-case distortion, i.e.,

$d_{wc} = \max_x \left\| x - R\left(Q\left(\Delta^{-1}(\Phi x + w)\right)\right) \right\|_2$,

or, if $x$ is modeled as a random variable, average distortion,

$d_{avg} = E_x\left\{ \left\| x - R\left(Q\left(\Delta^{-1}(\Phi x + w)\right)\right) \right\|_2 \right\}$,

where $\hat{x} = R\left(Q\left(\Delta^{-1}(\Phi x + w)\right)\right)$ is the signal reconstructed from the quantization of $x$.

In principle, under this sampling model, there are two ways to increase the bit-rate and reduce the
quantization distortion. The first is to increase the number of bits used per quantized coefficient. In
terms of the description above, this is equivalent to decreasing the precision parameter $\Delta_m$. For example,
reducing $\Delta_m$ by one half will double the number of quantization levels necessary and, thus, increase the necessary
bit-rate by 1 bit per coefficient. On the other hand, it will decrease by a factor of 2 the ambiguity on each quantized
coefficient, and, thus, the reconstruction error. Using this approach to increase the bit-rate, an exponential
reduction in the average error is possible as a function of the bit-rate

$d = O(c^r), \quad c < 1$, (5)

where $r = MB$ is the total rate used to represent the signal at $M$ measurements and $B$ bits per
measurement.
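The arithmetic behind the first option can be made concrete with a short worked example. The dynamic range of 8 and the helper name `bits_per_measurement` are hypothetical values chosen for illustration, assuming a uniform quantizer that must cover the whole range without saturating.

```python
import math

def bits_per_measurement(dynamic_range, delta):
    """Bits needed so that a uniform quantizer with step `delta` covers an
    interval of length `dynamic_range` without saturating (illustrative)."""
    levels = math.ceil(dynamic_range / delta)
    return max(1, math.ceil(math.log2(levels)))

# Halving delta doubles the number of levels and adds one bit per coefficient,
# while the width of each quantization interval (the ambiguity) is halved.
for delta in (1.0, 0.5, 0.25):
    print(f"delta = {delta}: {bits_per_measurement(8.0, delta)} bits")
```

Each additional bit per coefficient halves the interval width, which is the source of the exponential dependence on the rate in (5).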
The second way is to increase the number of measurements at a fixed number of bits per coefficient.
In [2], [9] it is shown that the distortion (average or worst-case) cannot decrease faster than linearly
with respect to the oversampling rate, which, at a fixed number of bits per measurement, is proportional
to the bit-rate; i.e.,

$d = \Omega(1/r)$, (6)

much slower than the rate in (5). It is further shown in [2], [3] that linear reconstruction does not reach
this lower bound, whereas consistent reconstruction approaches do. Thus, the rate-distortion trade-off does
not scale favorably when increasing the number of measurements at a constant bit-rate per measurement.



A similar result can be shown for compressive acquisition of sparse signals [25]. 

Despite the adverse trade-off, oversampling is an effective approach to achieve robustness [3], [34]-[38]
and it is desirable to improve this adverse trade-off. Approaches such as Sigma-Delta quantization can be 
shown to improve the performance at the expense of requiring feedback when computing the coefficients. 
Even with Sigma-Delta quantization, the error decay cannot become exponential in the oversampling 
rate [5], unless further coding is used [39]. This can be an issue in applications where simplicity and 
reduced communication is important, such as distributed sensor networks. It is, thus, desirable to achieve 
scalar quantization where oversampling provides a favorable rate vs. distortion trade-off, as presented in 
this paper. 

The fundamental reason for this trade-off is how effectively the available quantization bits are used when
oversampling. A linearly oversampled $K$-dimensional signal occupies only a $K$-dimensional subspace
(or affine subspace, if dithering is used) in the $M$-dimensional measurement space, as shown in Fig. 2(a).
On the other hand, the $MB$ bits used in the representation create $2^{MB}$ quantization cells that equally occupy
the whole $M$-dimensional space, as shown in Fig. 2(b). The oversampled representation of the signal will
quantize to a particular quantization vector $q$ only if the $K$-dimensional plane intersects the corresponding
quantization cell. As evident in Fig. 2(c), most of the available quantization cells are not intersected by
the plane, and therefore most of the available quantization points $q$ are not used. Careful counting of
the intersected cells provides the bound in (6) [2], [9]. The bound does not depend on the spacing of
the quantization intervals, or their size. A similar bound can be shown for a union of $K$-dimensional
subspaces, applicable in the case of Compressive Sensing [25], [33].
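The cell-counting argument can be checked numerically in a simple special case. The sketch below is an illustration under stated assumptions, not the general setting of the bound: it takes $K = 2$, one bit per measurement with a sign quantizer, no dither, and random Gaussian measurement vectors, and counts how many of the $2^M$ possible bit vectors are actually produced. In this setting the $M$ measurement hyperplanes are lines through the origin, which split the plane into at most $2M$ regions. The function name and parameters are hypothetical.

```python
import math
import random

def count_used_cells(M, n_angles=20000, seed=0):
    """Count distinct 1-bit quantization vectors produced by 2-D signals.

    Assumptions: K = 2, sign quantizer, no dither. The bits then depend only
    on the direction of x, so it suffices to sweep the unit circle.
    """
    rng = random.Random(seed)
    Phi = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(M)]
    patterns = set()
    for t in range(n_angles):
        theta = 2.0 * math.pi * t / n_angles
        x = (math.cos(theta), math.sin(theta))
        bits = tuple(1 if p[0] * x[0] + p[1] * x[1] >= 0 else 0 for p in Phi)
        patterns.add(bits)
    return len(patterns)

for M in (4, 8, 16):
    print(f"M = {M}: {count_used_cells(M)} cells used out of {2 ** M}")
```

The number of cells used grows at most linearly in $M$, while the number of available cells grows exponentially, which matches the qualitative picture of Fig. 2(c).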

To overcome the adverse trade-off, a scalar quantizer should be able to use most of the $2^{MB}$ available
quantization vectors, i.e., intersect most of the available quantization cells. Note that no matter how we
choose the quantization intervals, the shape of the quantization cells is rectangular and aligned with the
axes. Thus, improving the trade-off requires a strategy other than changing the shape and positioning of
the quantization cells. The approach we use in this paper is to make the quantization cells non-contiguous
by making the quantization function non-monotonic, as shown in Figs. 1(c) and 1(d). This is, in many
ways, similar to the binning of quantization cells explored experimentally in [40]. The advantage of our
approach is that it facilitates theoretical analysis and can scale down to even one bit per measurement.
In the remainder of this paper we demonstrate that our proposed approach achieves, with very high
probability, exponential decay in the worst-case quantization error as a function of the oversampling rate,
and, consequently, the bit-rate.

Fig. 2. Oversampled Signals and Quantization. (a) Oversampled signals occupy only a small subspace in the measurement
space. (b) The quantization grid quantizes all the measurement space. (c) The signal subspace intersects very few of the available
quantization cells.

III. Rate-Efficient Scalar Quantization

A. Overview 

Our approach uses the scalar quantizer described in (1) and (2) with the quantization function in
Figs. 1(c) and 1(d). The quantization function is explicitly designed to be non-monotonic, such that
non-contiguous quantization regions quantize to the same quantization value. This allows the subspace defined
by the measurements to intersect the majority of the available quantization cells which, in turn, ensures
efficient use of the available bit-rate. Although we do not describe a specific reconstruction algorithm, we
assume that the reconstruction algorithm produces a signal consistent with the measurements, in addition
to imposing a signal model or other application-specific requirements.

Our end goal is to determine an upper bound for the probability that there exist two signals x and x' with 
distance greater than d that quantize to the same quantization vector given the number of measurements 
M. If no such pair exists, then any consistent reconstruction algorithm will reconstruct a signal that has 
distance from the acquired signal at most d. We wish to demonstrate that this probability vanishes very 
fast as the number of measurements increases. Furthermore, we wish to show that for a fixed probability 
of such a signal pair existing, the distance to guarantee such probability decreases exponentially with the 
number of measurements. An important feature of our development is that the probability of success is 
on the acquisition system randomization, which we control, and not on any probabilistic model for the 
signals acquired. 

To achieve our goal we first consider a single measurement on a pair of signals $x$ and $x'$ with distance
$d = \|x - x'\|_2$, and analyze the probability that a single measurement of the two signals quantizes to the
same quantization value for both. Our result is summarized in Lemma 3.1.



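Lemma 3.1: Consider signals $x$ and $x'$ with $d = \|x - x'\|_2$ and the quantized measurement function

$q_m = Q\left(\frac{\langle x, \phi_m \rangle + w_m}{\Delta}\right)$,

where $Q(x) = \lceil x \rceil \bmod 2$, $\phi_m \in \mathbb{R}^K$ contains i.i.d. elements drawn from a normal distribution with
mean 0 and variance $\sigma^2$, and $w_m$ is i.i.d., uniformly distributed in $[0, \Delta]$.

The probability that the two signals produce equal quantized measurements is

$P(x, x' \text{ consistent} \mid d) = \frac{1}{2} + \sum_{i=0}^{+\infty} \frac{e^{-\left(\frac{\pi (2i+1) \sigma d}{\sqrt{2} \Delta}\right)^2}}{\left(\pi (i + 1/2)\right)^2} \le \frac{1}{2} + \frac{1}{2} e^{-\left(\frac{\pi \sigma d}{\sqrt{2} \Delta}\right)^2}$.

We prove this lemma in Sec. III-B.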
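The 1-bit non-monotonic quantizer of Lemma 3.1, $Q(x) = \lceil x \rceil \bmod 2$, can be sketched in a few lines of Python. The function names and the example values are hypothetical; the sketch only illustrates that quantization intervals of width $\Delta$ alternate between the two bit values, so that non-contiguous intervals share a bit.

```python
import math

def bit_of_measurement(y, delta):
    """Q(y / delta) with Q(x) = ceil(x) mod 2, as in Lemma 3.1."""
    return math.ceil(y / delta) % 2

def universal_bit(x, phi, w, delta):
    """One quantized measurement: Q((<x, phi> + w) / delta)."""
    y = sum(xi * pi for xi, pi in zip(x, phi)) + w
    return bit_of_measurement(y, delta)

delta = 1.0
# Consecutive intervals of width delta alternate between the two bit values,
# so the quantizer output is periodic in y with period 2 * delta.
for y in (-1.7, -0.7, 0.3, 1.3, 2.3):
    print(f"y = {y:+.1f} -> bit {bit_of_measurement(y, delta)}")
```

Because the bit depends on the measurement only through its position within a period of length $2\Delta$, no extra bits are needed to cover a large dynamic range.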



Next, in Sec. III-C we consider a single measurement on two $\epsilon$-balls, $B_\epsilon(x)$ and $B_\epsilon(x')$, centered at $x$
and $x'$, i.e., on all the signals of distance less than $\epsilon$ from $x$ and $x'$. Using Lemma 3.1 we lower-bound
the probability that no signal in $B_\epsilon(x)$ is consistent with any signal in $B_\epsilon(x')$. This leads to Lemma 3.2.

Lemma 3.2: Consider signals $x$ and $x'$ with $d = \|x - x'\|_2$, the $\epsilon$-balls $B_\epsilon(x)$ and $B_\epsilon(x')$, and the
quantized measurement function in Lemma 3.1.
The probability that no signal in $B_\epsilon(x)$ produces a quantized measurement equal to that of any signal in
$B_\epsilon(x')$ (i.e., the probability that the two balls produce inconsistent measurements) is lower bounded by

$P(B_\epsilon(x), B_\epsilon(x') \text{ inconsistent} \mid d) \ge 1 - \left( P(x, x' \text{ consistent} \mid d) + \frac{2 c_p \epsilon}{\Delta} + \gamma\left(\frac{K}{2}, \left(\frac{c_p}{\sqrt{2} \sigma}\right)^2\right) \right)$,

for any choice of $c_p < \Delta / (2\epsilon)$, where $\gamma(s, x)$ is the regularized upper incomplete gamma function.

Finally we construct a covering of the signal space under consideration using $\epsilon$-balls. We consider all
pairs of $\epsilon$-balls in this covering and using Lemma 3.2 we lower bound the probability that no pair of
signals with distance greater than $d$ produces consistent measurements. This produces the main result of
this work, proven in Sec. III-D.



Theorem 3.3: Consider the set of signals

$S = \{x \in \mathbb{R}^K \mid \|x\|_2 \le 1\}$

and the measurement system

$q_m = Q\left(\frac{\langle x, \phi_m \rangle + w_m}{\Delta}\right), \quad m = 1, \ldots, M$,

where $Q(x) = \lceil x \rceil \bmod 2$, $\phi_m \in \mathbb{R}^K$ contains i.i.d. elements drawn from a standard normal distribution, and
$w_m$ is i.i.d., uniformly distributed in $[0, \Delta]$.

For any $c_r > 1/2$, arbitrarily close to 1/2, there exists a constant $c_a$ and a choice of $\Delta$ proportional
to $d$ such that with probability greater than

$P \ge 1 - \left(\frac{c_a \sqrt{K}}{d}\right)^{2K} (c_r)^M$

the following holds for all $x, x' \in S$:

$\|x - x'\|_2 \ge d \Rightarrow q \ne q'$,

where $q$ and $q'$ are the vectors containing the quantized measurements of $x$ and $x'$, respectively.

The theorem trades off the leading term $\left(c_a \sqrt{K} / d\right)^{2K}$ against how close $c_r$ is to 1/2, i.e., how fast the
probability in the statement approaches 1 as a function of the number of measurements. Using an example,
we also make this result concrete and show that for $K > 8$, we can achieve $c_a = 60$, $c_r = 3/4$.

Our results do not assume a probabilistic model on the signal. Instead they are similar in nature to
many probabilistic results in Compressive Sensing [12]-[15], [17]-[19]. With overwhelming probability
the system works on all signals presented to it. It is also important to note that the results are not
asymptotic, but hold for finite $K$ and $M$. Further note that the alternative is not that the system provides
incorrect results, only that we cannot guarantee that it will provide correct results. Thus, we fix the
probability that we cannot guarantee the results to $P_0$ and demonstrate the desired exponential decay of
the error.

Corollary 3.4: Consider the set of signals, the measurement system and the consistent reconstruction
process implied by Thm. 3.3. With probability greater than

$P \ge 1 - P_0$,

the following holds for all $x, x' \in S$:

$\|x - x'\|_2 \ge \frac{c_a \sqrt{K}}{P_0^{1/(2K)}} (c_r)^{\frac{M}{2K}} \Rightarrow q \ne q'$.



The corollary makes explicit the exponential decay of the worst-case error as a function both of
the number of measurements $M$ and the number of bits used. This means that the worst-case error
decays significantly faster than the linear decay demonstrated with classical quantization of oversampled
frames [1]-[3] and defeats the lower bound in [2]. Furthermore, we achieve that rate by quantizing each
coefficient independently, unlike existing approaches [8], [39]. Since this is a probabilistic result on the
system probability space, it further implies that a system that satisfies the desired exponential decay
property exists.
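To see the exponential decay numerically, the sketch below evaluates the distance guarantee as a function of $M$. It assumes the bound of Corollary 3.4 takes the form $d = c_a \sqrt{K} \, (c_r)^{M/(2K)} / P_0^{1/(2K)}$, and uses the example constants $c_a = 60$, $c_r = 3/4$; the values of $K$, $P_0$ and the function name are hypothetical choices for illustration.

```python
import math

def distance_guarantee(M, K, P0, c_a=60.0, c_r=0.75):
    """Smallest distance d for which the guarantee holds with probability
    at least 1 - P0, under the assumed form
        d = c_a * sqrt(K) * c_r**(M / (2K)) / P0**(1 / (2K)).
    """
    return c_a * math.sqrt(K) * c_r ** (M / (2.0 * K)) / P0 ** (1.0 / (2.0 * K))

K, P0 = 16, 1e-6
for M in (256, 512, 1024, 2048):
    print(f"M = {M:5d}: d = {distance_guarantee(M, K, P0):.3e}")

# Every additional 2K measurements multiply the guaranteed distance by c_r.
ratio = distance_guarantee(512 + 2 * K, K, P0) / distance_guarantee(512, K, P0)
print("ratio after 2K more measurements:", ratio)
```

Since the signals lie in the unit ball, the guarantee is only informative once $d$ drops below 2, which under these constants requires a substantial number of measurements.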

One of the drawbacks of this approach is that it requires the quantizer to be designed in advance
with the target distortion in mind, i.e., the choice of the scaling parameter $\Delta$ of the quantizer affects
the distortion. This might be an issue if the target accuracy and oversampling rate are not known at the
quantizer design stage, but, for example, need to be estimated from the measurements adaptively during
measurement time. This drawback, as well as one way to overcome it, is discussed further in Sec. V.

The remainder of this section presents the above results in sequence. 



B. Quantized Measurement of Signal Pairs 

We first consider two signals $x$ and $x'$ with $\ell_2$ distance $d = \|x - x'\|_2$. We analyze the probability
that a single quantized measurement of the two signals produces the same bit values, i.e., is consistent
for the two signals. Since we only discuss the measurement of a single bit, we omit the subscript $m$
from (1) and (2) to simplify the notation in the remainder of this section. The analysis does not depend
on it. We use $q$ and $q'$ to denote a single quantized measurement of $x$ and $x'$, respectively.

Fig. 3. Analysis of the probability of consistency for a single bit. (a) Dithering makes this probability depend only on the distance
between the two signals. (b) Probability of consistency as a function of the projection length. (c) The two components affecting
the overall probability of consistency.



We first consider the desired probability conditional on the projected distance $l$, i.e., the distance
between the measurements of the signals

$l = |y - y'| = |\langle x, \phi \rangle + w - (\langle x', \phi \rangle + w)| = |\langle x - x', \phi \rangle|$. (7)

The addition of dither makes the probability that the two signals quantize to consistent bits depend only on the
distance $l$ and not on the individual values $y$ and $y'$, as demonstrated in Fig. 3(a). In the top part of the
figure an example measurement is depicted. Depending on the amount of dither, the two measurements
can quantize to different values (as shown in the second line of the plot) or to the same values (as shown
in the third line). Since the dither is uniform in $[0, \Delta]$, the probability that the two bits are consistent given
$l$ equals

$P(q = q' \mid l) = 1 - \frac{l \bmod \Delta}{\Delta} = 1 + 2i - \frac{l}{\Delta}$, if $2i\Delta \le l \le (2i + 1)\Delta$,

$P(q = q' \mid l) = \frac{l \bmod \Delta}{\Delta} = \frac{l}{\Delta} - (2i + 1)$, if $(2i + 1)\Delta \le l \le 2(i + 1)\Delta$,

for some integer $i$, which is plotted in Fig. 3(b).
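The conditional probability above is a periodic triangle function of $l$ with period $2\Delta$. The following sketch evaluates it and compares it against a Monte Carlo estimate over the dither. The function names, the fixed measurement value `y`, and the sample sizes are assumptions for illustration only.

```python
import math
import random

def p_consistent_given_l(l, delta):
    """Periodic triangle: 1 at l = 0, 0 at l = delta, period 2 * delta."""
    r = (l / delta) % 2.0
    return 1.0 - r if r <= 1.0 else r - 1.0

def mc_consistent_given_l(l, delta, n=20000, seed=0, y=0.37):
    """Monte Carlo estimate over the dither w ~ Uniform[0, delta].

    The two measurements are y + w and y + l + w; the value y = 0.37 is an
    arbitrary choice, since the result should not depend on it.
    """
    rng = random.Random(seed)
    same = 0
    for _ in range(n):
        w = rng.uniform(0.0, delta)
        b1 = math.ceil((y + w) / delta) % 2
        b2 = math.ceil((y + l + w) / delta) % 2
        same += (b1 == b2)
    return same / n

delta = 1.0
for l in (0.3, 1.4, 2.6):
    print(l, p_consistent_given_l(l, delta), mc_consistent_given_l(l, delta))
```

The simulated frequencies should track the triangle function closely for any fixed `y`, in line with the observation that dithering makes the probability depend only on $l$.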

Furthermore, from (7) and the distribution of $\phi$, it follows that $l$ is distributed as the magnitude of the
normal distribution with variance $(\sigma d)^2$:

$f(l \mid d) = \sqrt{\frac{2}{\pi}} \frac{1}{\sigma d} e^{-\frac{l^2}{2 (\sigma d)^2}}, \quad l \ge 0$. (8)

Thus, the two quantization bits are the same given the distance of the signals $d$ with probability

$P(q = q' \mid d) = \int_{l \ge 0} P(q = q' \mid l) \, f(l \mid d) \, dl$. (9)

In order to evaluate the integral, we make it symmetric around zero by mirroring it and dividing it by
two. The two components of the expanded integral are shown in Fig. 3(c). These are a periodic triangle
function with height 1 and width $2\Delta$ and a normal distribution function with variance $(\sigma d)^2$.

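Using Parseval's theorem, we can express that integral in the Fourier domain (with respect to $l$). Noting
that the periodic triangle function can also be represented as a convolution of a single triangle function
with an impulse train, we obtain:

$P(q = q' \mid d) = \int_{-\infty}^{+\infty} \mathcal{F}\{P(q = q' \mid l)\}(\xi) \, \mathcal{F}\{f(l \mid d)\}(\xi) \, d\xi = \sum_{i=-\infty}^{+\infty} \frac{1}{2} \operatorname{sinc}^2\left(\frac{i}{2}\right) e^{-\left(\frac{\pi i \sigma d}{\sqrt{2} \Delta}\right)^2}$,

where $\operatorname{sinc}(x) = \frac{\sin(\pi x)}{\pi x}$ and $\xi$ is the frequency with respect to $l$. Since $\operatorname{sinc}(x) = 0$ if $x$ is a non-zero
integer, $\sin^2(\pi x / 2) = 1$ if $x$ is an odd integer, and $\operatorname{sinc}(0) = 1$,

$P(q = q' \mid d) = \frac{1}{2} + \sum_{i=0}^{+\infty} \frac{e^{-\left(\frac{\pi (2i+1) \sigma d}{\sqrt{2} \Delta}\right)^2}}{\left(\pi (i + 1/2)\right)^2}$, (10)

which proves the equality in Lemma 3.1.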



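A very good lower bound for (10) can be derived using the first term of the summation:

$P(q = q' \mid d) \ge \frac{1}{2} + \frac{4}{\pi^2} e^{-\left(\frac{\pi \sigma d}{\sqrt{2} \Delta}\right)^2}$.

An alternative lower bound can also be derived by explicitly integrating (9) up to $l \le \Delta$:

$P(q = q' \mid d) \ge 1 - \sqrt{\frac{2}{\pi}} \frac{\sigma d}{\Delta}$.

An upper bound can be derived using $e^{-\left(\frac{\pi (2i+1) \sigma d}{\sqrt{2} \Delta}\right)^2} \le e^{-\left(\frac{\pi \sigma d}{\sqrt{2} \Delta}\right)^2}$ in the summation in (10), and noting
that $P(q = q' \mid d = 0) = 1$:

$P(q = q' \mid d) \le \frac{1}{2} + \frac{1}{2} e^{-\left(\frac{\pi \sigma d}{\sqrt{2} \Delta}\right)^2}$,

which proves the inequality and concludes the proof of Lemma 3.1. Fig. 4 plots these bounds and
illustrates their tightness. A bound for the probability of inconsistency can be determined from the
bounds above using $P(q \ne q' \mid d) = 1 - P(q = q' \mid d)$.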
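The series in (10) and its bounds are easy to evaluate numerically. The sketch below truncates the series at a finite number of terms, computes the three bounds, and compares the result with a Monte Carlo simulation of the quantizer of Lemma 3.1. The truncation length, the test signals, the sample size and all function names are assumptions made for illustration.

```python
import math
import random

def _a(d, sigma, delta):
    # Common exponent argument: pi * sigma * d / (sqrt(2) * delta).
    return math.pi * sigma * d / (math.sqrt(2.0) * delta)

def p_consistent(d, sigma, delta, terms=200):
    """Truncated series (10) for P(q = q' | d)."""
    a = _a(d, sigma, delta)
    s = sum(math.exp(-((2 * i + 1) * a) ** 2) / (math.pi * (i + 0.5)) ** 2
            for i in range(terms))
    return 0.5 + s

def upper_bound(d, sigma, delta):
    return 0.5 + 0.5 * math.exp(-_a(d, sigma, delta) ** 2)

def lower_bound_first_term(d, sigma, delta):
    return 0.5 + (4.0 / math.pi ** 2) * math.exp(-_a(d, sigma, delta) ** 2)

def lower_bound_linear(d, sigma, delta):
    return 1.0 - math.sqrt(2.0 / math.pi) * sigma * d / delta

def mc_estimate(d, sigma, delta, K=3, n=40000, seed=0):
    """Simulate q = ceil((<x, phi> + w) / delta) mod 2 for two signals at distance d."""
    rng = random.Random(seed)
    x = [0.0] * K
    xp = [d] + [0.0] * (K - 1)
    same = 0
    for _ in range(n):
        phi = [rng.gauss(0.0, sigma) for _ in range(K)]
        w = rng.uniform(0.0, delta)
        y = sum(a * b for a, b in zip(x, phi)) + w
        yp = sum(a * b for a, b in zip(xp, phi)) + w
        same += (math.ceil(y / delta) % 2 == math.ceil(yp / delta) % 2)
    return same / n

d, sigma, delta = 0.5, 1.0, 1.0
print("series     :", p_consistent(d, sigma, delta))
print("simulation :", mc_estimate(d, sigma, delta))
print("bounds     :", lower_bound_linear(d, sigma, delta),
      lower_bound_first_term(d, sigma, delta), upper_bound(d, sigma, delta))
```

As $d$ grows relative to $\Delta$, both the series and the bounds approach 1/2, i.e., a single bit becomes as likely to differ as to agree for distant signals.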
Fig. 4. Upper and lower bounds for the probability two different signals have consistent quantization bits

Using these results it is possible to further analyse the performance of this method on finite sets of 
signals. However, for many signal processing applications it is desirable to analyse infinite sets of signals, 
such as signal spaces. To facilitate this analysis the next section examines how the system behaves on 
pairs of $\epsilon$-balls in the signal space.

C. Consistency of $\epsilon$-Balls

In this section we examine the performance on pairs of sets of signals. Specifically, the sets we consider
are $\epsilon$-balls in $\mathbb{R}^K$ with radius $\epsilon$ and centered at $x$, defined as

$B_\epsilon(x) = \{s \in \mathbb{R}^K \mid \|s - x\|_2 \le \epsilon\}$.

We examine balls $B_\epsilon(x)$ and $B_\epsilon(x')$ around two signals $x$ and $x'$ with distance $d = \|x - x'\|_2$, as above.
We desire to lower bound the probability that the quantized measurements of all the signals in $B_\epsilon(x)$ are
consistent with each other, and inconsistent with the ones from all the signals in $B_\epsilon(x')$.

To determine the lower bound, we examine how the measurement vector (j) affects the e-balls. It is 
straightforward to show that the measurement projects the B e (x) to an interval in R of length at most 
2e||0||2, centered at (x, <fi). The length of the interval affects the probability that the measurements of all 
the signals in B e (x) quantize consistently. To guarantee consistency we bound the length of this interval 
to be smaller than 2c p e, i.e., we require that ||^>||2 < c p . This fails with probability 

\[ P(\|\phi\|_2 \ge c_p) = \gamma\!\left(\frac{K}{2}, \left(\frac{c_p}{\sqrt{2}\sigma}\right)^2\right), \]

where $\gamma(s,x)$ is the regularized upper incomplete gamma function, and $\gamma\!\left(\frac{K}{2}, \left(\frac{x}{\sqrt{2}}\right)^2\right)$ is the tail integral 
of the $\chi$ distribution with $K$ degrees of freedom (i.e., the distribution of the norm of a $K$-dimensional 
standard Gaussian vector). To ensure that all the signals in the $\epsilon$-ball can quantize to the same bit value 
with non-zero probability we pick $c_p$ such that $2c_p\epsilon < \Delta$. 

Fig. 5. Measurement quantization of $\epsilon$-balls. (a) Ball measurement and consistency behavior. (b) Probability of no consistency 
guarantee given the projection length, (11). 
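To get a feel for how quickly the failure probability of the norm constraint vanishes, the tail can be evaluated in closed form when $K$ is even. The sketch below is illustrative only: it assumes $\phi$ has i.i.d. zero-mean Gaussian entries with variance $\sigma^2$, so that the tail is the regularized upper incomplete gamma function evaluated at $c_p^2/(2\sigma^2)$, and it uses the standard identity $Q(s,x) = e^{-x}\sum_{j<s} x^j/j!$ for integer $s$. The function name `gaussian_norm_tail` is hypothetical.

```python
import math

def gaussian_norm_tail(K, c_p, sigma):
    """P(||phi||_2 >= c_p) for phi ~ N(0, sigma^2 I_K), with K even.

    Uses Q(s, x) = exp(-x) * sum_{j < s} x^j / j!  for integer s = K/2,
    evaluated at x = c_p^2 / (2 sigma^2).
    """
    if K % 2 != 0:
        raise ValueError("this closed form requires an even K")
    x = (c_p / (math.sqrt(2.0) * sigma)) ** 2
    term, total = 1.0, 0.0
    for j in range(K // 2):
        total += term          # adds x^j / j!
        term *= x / (j + 1)    # advances to x^(j+1) / (j+1)!
    return math.exp(-x) * total

if __name__ == "__main__":
    K = 8
    for c_p in (1.0, 1.5, 2.0):
        print(c_p, gaussian_norm_tail(K, c_p, 1.0 / math.sqrt(K)))
```

For $K = 2$ and $\sigma = 1$ this reduces to the Rayleigh tail $e^{-c_p^2/2}$, which gives a quick sanity check of the expression.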

Under this restriction, the two balls will produce inconsistent measurements only if the two intervals 
they project onto are located completely within two quantization intervals with different quantization 
values. Thus we cannot guarantee consistency within the ball if the ball projection is on the boundary 
of a quantization threshold, and we cannot guarantee inconsistency between the balls if parts of the 
projections of the two balls quantize to the same bit. Figure 5(a) demonstrates the quantization of $\epsilon$-balls, 
and examines when all the elements of the two balls quantize inconsistently. 

Assuming that the width of the ball projections is bounded, as described above, we can characterize 
the probability that the ball centers project on the quantization grid in a way that all signals within 
one ball quantize to one quantization value, and all the signals from the other ball quantize to 
the other. This is the probability that we can guarantee that all measurements from the signals in one 
ball are inconsistent with all the signals from the other ball. We desire to upper bound the probability 
that we fail to guarantee this inconsistency. 

Using, as before, $l$ to denote the projected distance between the two centers of the balls, we cannot 
guarantee inconsistency if $|l - 2i\Delta| \le 2c_p\epsilon$ for some $i$. In this case the balls are guaranteed to intersect 
modulo $2\Delta$, i.e., they are guaranteed to have intervals that quantize to the same value. If $2c_p\epsilon \le l - 2i\Delta \le \Delta$ 
for some $i$, we consider the projection of the balls and the two points, one from each projection, closest 
to each other. If these, which have distance $l - 2c_p\epsilon$ modulo $2\Delta$, are inconsistent, then the two balls 
are guaranteed to be inconsistent. Similarly, if $0 \le l - (2i+1)\Delta \le \Delta - 2c_p\epsilon$ for some $i$, we consider 
the projection of the balls and the two points, one from each projection, farthest from each other. If 
these, which have distance $l + 2c_p\epsilon$ modulo $2\Delta$, are inconsistent, then the two balls are guaranteed to be 
inconsistent. Since, given $l$, the dither distributes the centers of the balls uniformly within the quantization 
intervals, the probability that we cannot guarantee inconsistency can be bounded in a manner similar to (8): 



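\[ P(\text{no guarantee} \mid l) \le \begin{cases} 1, & \text{if } |l - 2i\Delta| \le 2c_p\epsilon \\ \dfrac{\Delta + 2c_p\epsilon - (l - 2i\Delta)}{\Delta}, & \text{if } 2c_p\epsilon \le l - 2i\Delta \le \Delta \\ \dfrac{l + 2c_p\epsilon - (2i+1)\Delta}{\Delta}, & \text{if } 0 \le l - (2i+1)\Delta \le \Delta - 2c_p\epsilon, \end{cases} \tag{11} \]

for some integer $i$. The shape of this upper bound is shown in Fig. 5(b). Note that the right hand side 
of (11) can be expressed in terms of $P(q=q'|l)$ from (8) to produce 

\[ P(\text{no guarantee} \mid l) \le \min\left\{ P(q=q'|l) + \frac{2c_p\epsilon}{\Delta},\; 1 \right\} \le P(q=q'|l) + \frac{2c_p\epsilon}{\Delta}. \]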

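Averaging over $l$ given $d$ yields $P(\text{no guarantee} \mid d) \le P(q=q'|d) + 2c_p\epsilon/\Delta$. Thus we can upper bound 
the probability that we fail to guarantee inconsistency, due to either a large ball projection 
interval or an unfavorable projection of the ball centers, using the union bound: 

\[ \begin{aligned} P(\exists\, \mathbf{v} \in B_\epsilon(\mathbf{x}), \mathbf{v}' \in B_\epsilon(\mathbf{x}') \text{ s.t. } q_{\mathbf{v}} = q_{\mathbf{v}'} \mid d) &\le P(\text{no guarantee} \mid d) + P(\|\phi\|_2 \ge c_p) \\ &\le P(q=q'|d) + \frac{2c_p\epsilon}{\Delta} + \gamma\!\left(\frac{K}{2}, \left(\frac{c_p}{\sqrt{2}\sigma}\right)^2\right), \end{aligned} \]

where $q_{\mathbf{v}}$ and $q_{\mathbf{v}'}$ are the quantization values of $\mathbf{v}$ and $\mathbf{v}'$, respectively. This proves Lemma 3.2. 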
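The piecewise bound (11) and its compact form in terms of $P(q=q'|l)$ can be compared numerically. This is a minimal sketch under stated assumptions: (8) is taken to be the triangle wave of period $2\Delta$ that equals 1 at even multiples of $\Delta$ and 0 at odd multiples, which is the form implied by the cases above; the function names are hypothetical.

```python
import math

def p_consistent_given_l(l, delta):
    """Assumed form of (8): triangle wave with period 2*delta,
    equal to 1 at even multiples of delta and 0 at odd multiples."""
    r = math.fmod(abs(l), 2.0 * delta)      # position inside one period
    return abs(r - delta) / delta

def p_no_guarantee(l, delta, c_p, eps):
    """Piecewise upper bound (11) for a projected center distance l >= 0."""
    w = 2.0 * c_p * eps                     # bound on the projected ball width
    if not w < delta:
        raise ValueError("need 2*c_p*eps < delta")
    r = math.fmod(l, 2.0 * delta)           # r = l - 2*i*delta for the right i
    if r <= w or r >= 2.0 * delta - w:      # |l - 2*i*delta| <= w for some i
        return 1.0
    if r <= delta:                          # w <= l - 2*i*delta <= delta
        return (delta + w - r) / delta
    return (r - delta + w) / delta          # 0 <= l - (2i+1)*delta <= delta - w

if __name__ == "__main__":
    delta, c_p, eps = 1.0, 2.0, 0.05
    for l in (0.0, 0.5, 1.0, 1.5, 2.0):
        print(l, p_no_guarantee(l, delta, c_p, eps),
              min(p_consistent_given_l(l, delta) + 2 * c_p * eps / delta, 1.0))
```

Both columns printed by the usage example coincide, which is the identity used to pass from (11) to the bound in terms of $P(q=q'|l)$.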



D. Consistency Of M Measurements For All Signals In The Space 

To determine the overall quantization performance, we consider bounded norm signals $\mathbf{x}$ in a $K$-dimensional 
signal space. Without loss of generality, we assume $\|\mathbf{x}\|_2 \le 1$, and denote the set of all 
such signals using $\mathcal{S} = \{\mathbf{x} \in \mathbb{R}^K, \|\mathbf{x}\|_2 \le 1\}$. To consider all the points in $\mathcal{S}$ we construct a covering 
using $\epsilon$-balls, such that any signal in $\mathcal{S}$ belongs to at least one such ball. The minimum number of balls 
required to cover a signal set is the covering number of the set. For the unit ball in $K$ dimensions, the 
covering number is $C_\epsilon \le (3/\epsilon)^K$ $\epsilon$-balls [15]. 



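Next, we consider all pairs of balls $(B_\epsilon(\mathbf{x}), B_\epsilon(\mathbf{x}'))$, such that $\|\mathbf{x}-\mathbf{x}'\|_2 > d$. The number of those is 
upper bounded by the total number of pairs of $\epsilon$-balls we can form from the covering, independent of 
the distance of their centers, namely $\binom{C_\epsilon}{2} \le C_\epsilon^2$ pairs. The probability that at least one pair of vectors, 
one from each ball, has $M$ consistent measurements is upper bounded by 

\[ \begin{aligned} P(M \text{ measurements consistent} \mid d) &= P(\exists\, \mathbf{v} \in B_\epsilon(\mathbf{x}), \mathbf{v}' \in B_\epsilon(\mathbf{x}') \text{ s.t. } \mathbf{q}_{\mathbf{v}} = \mathbf{q}_{\mathbf{v}'} \mid d) \\ &\le P(\exists\, \mathbf{v} \in B_\epsilon(\mathbf{x}), \mathbf{v}' \in B_\epsilon(\mathbf{x}') \text{ s.t. } q_{\mathbf{v}} = q_{\mathbf{v}'} \mid d)^M. \end{aligned} \]

Thus, the probability that there exists at least one pair of balls that contains at least one pair of vectors, 
one from each ball, that quantize to $M$ consistent measurements can be upper bounded using the union 
bound 

\[ P(\exists\, \mathbf{x}, \mathbf{x}' \in \mathcal{S}, \|\mathbf{x}-\mathbf{x}'\|_2 > d \text{ s.t. } \mathbf{q} = \mathbf{q}') \le \left(\frac{3}{\epsilon}\right)^{2K} P(M \text{ measurements consistent} \mid d). \]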

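It follows that the probability that we cannot guarantee inconsistency for all vectors with distance 
greater than $d$ is upper bounded by 

\[ \begin{aligned} P(\exists\, \mathbf{x}, \mathbf{x}' \in \mathcal{S}, \|\mathbf{x}-\mathbf{x}'\|_2 > d \text{ s.t. } \mathbf{q} = \mathbf{q}') &\le \left(\frac{3}{\epsilon}\right)^{2K} \left( P(q=q'|d) + \frac{2c_p\epsilon}{\Delta} + \gamma\!\left(\frac{K}{2}, \left(\frac{c_p}{\sqrt{2}\sigma}\right)^2\right) \right)^M \\ &\le \left(\frac{3}{\epsilon}\right)^{2K} \left( \frac{1}{2} + \frac{1}{2} e^{-\left(\frac{\pi\sigma d}{\sqrt{2}\Delta}\right)^2} + \frac{2c_p\epsilon}{\Delta} + \gamma\!\left(\frac{K}{2}, \left(\frac{c_p}{\sqrt{2}\sigma}\right)^2\right) \right)^M. \end{aligned} \]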



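Picking $\sigma = \frac{1}{\sqrt{K}}$, $\epsilon = \frac{r_1\Delta}{2c_p}$, and $\Delta = \frac{r_2 d}{\sqrt{K}}$ for some ratios $r_1, r_2 > 0$, we obtain 

\[ P(\exists\, \mathbf{x}, \mathbf{x}' \in \mathcal{S}, \|\mathbf{x}-\mathbf{x}'\|_2 > d \text{ s.t. } \mathbf{q} = \mathbf{q}') \le \left(\frac{6c_p\sqrt{K}}{r_1 r_2 d}\right)^{2K} \left( \frac{1}{2} + \frac{1}{2} e^{-\left(\frac{\pi}{\sqrt{2} r_2}\right)^2} + r_1 + \gamma\!\left(\frac{K}{2}, \frac{c_p^2 K}{2}\right) \right)^M. \]

By setting $c_p$ arbitrarily large, and $r_1$ and $r_2$ arbitrarily small, we can achieve 

\[ P(\exists\, \mathbf{x}, \mathbf{x}' \in \mathcal{S}, \|\mathbf{x}-\mathbf{x}'\|_2 > d \text{ s.t. } \mathbf{q} = \mathbf{q}') \le \left(\frac{c_o\sqrt{K}}{d}\right)^{2K} (c_r)^M, \]

where $c_o = 6c_p/(r_1 r_2)$ increases as $c_r$ decreases, and $c_r$ can be any constant arbitrarily close to 1/2. This 
proves Theorem 3.3. Corollary 3.4 follows trivially. 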

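To make this result concrete, if $K > 8$ we can pick $c_p = 2$, $\epsilon = \frac{d}{20\sqrt{K}}$, and $\Delta = \frac{d}{\sqrt{K}}$ to 
obtain: 

\[ P(\exists\, \mathbf{x}, \mathbf{x}' \in \mathcal{S}, \|\mathbf{x}-\mathbf{x}'\|_2 > d \text{ s.t. } \mathbf{q} = \mathbf{q}') \le \left(\frac{60\sqrt{K}}{d}\right)^{2K} \left(\frac{3}{4}\right)^M = e^{2K\log\left(\frac{60\sqrt{K}}{d}\right) - M\log\left(\frac{4}{3}\right)}. \]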

We should remark that the choice of parameters $r_1$, $r_2$ at the last step, which also determines the 
design of the precision parameter $\Delta$, influences the decay rate of the error, at a trade-off with the 
leading constant term. While we can obtain a decay rate arbitrarily close to 1/2, we will also force the 
leading term $(c_o\sqrt{K}/d)^{2K}$ to become arbitrarily large. As mentioned before, the decision to decrease $\Delta$ 
should be made at design time. Furthermore, decreasing $\Delta$ can be difficult in certain practical hardware 
implementations. 

The $\sqrt{K}$ factor is consistent with scalar quantization of orthonormal basis expansions. Specifically, 
consider the orthonormal basis expansion of the signal, quantized to $B$ bits per coefficient for a total 
of $KB$ bits. The worst-case error per coefficient is $2^{-(B-1)}$ and, therefore, the total worst-case error is 
$2^{-(B-1)}\sqrt{K}$. 

To better understand the result, we examine how many bits we require to achieve the same performance 
as fine scalar quantization of orthonormal basis expansions. To provide the same error guarantee we set 
$d = 2^{-(B-1)}\sqrt{K}$. Using Corollary 3.4 to achieve this guarantee with failure probability at most $P_0$ we require 

\[ 2^{-(B-1)}\sqrt{K} \ge \frac{c_o\sqrt{K}}{P_0^{1/(2K)}} (c_r)^{\frac{M}{2K}} \;\Longrightarrow\; \frac{M}{K} \ge \frac{2\left( B\log 2 + \log\dfrac{c_o}{2P_0^{1/(2K)}} \right)}{\log(1/c_r)}. \]

Thus the number of bits per dimension $M/K$ required grows linearly with the bits per dimension 
$B$ required to achieve the same error guarantee in an orthonormal basis expansion. The oversampled 
approach asymptotically requires $2\log(2)/\log(1/c_r)$ times the number of bits per dimension, compared 
to fine quantization of orthonormal basis expansions, an overhead which can be designed to be arbitrarily 
close to 2 times. For our example $c_r = 3/4$, $2\log(2)/\log(1/c_r) \approx 4.82$. Although this penalty is 
significant, it is also significantly improved over classical scalar quantization of oversampled expansions. 
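The bit-budget comparison above reduces to simple arithmetic. The sketch below is an illustration under stated assumptions: it takes the guarantee in the form $(c_o\sqrt{K}/d)^{2K}(c_r)^M \le P_0$ with $d = 2^{-(B-1)}\sqrt{K}$, and the function names `overhead_factor` and `min_bits_per_dimension` are hypothetical.

```python
import math

def overhead_factor(c_r):
    """Asymptotic ratio of bits per dimension: 2 log(2) / log(1 / c_r)."""
    return 2.0 * math.log(2.0) / math.log(1.0 / c_r)

def min_bits_per_dimension(B, K, c_o, c_r, p0):
    """Smallest M/K for which (c_o*sqrt(K)/d)^(2K) * c_r^M <= p0
    when d = 2^-(B-1) * sqrt(K)."""
    numerator = (2.0 * (B * math.log(2.0) + math.log(c_o / 2.0))
                 + math.log(1.0 / p0) / K)
    return numerator / math.log(1.0 / c_r)

if __name__ == "__main__":
    print(round(overhead_factor(0.75), 2))       # decay constant of the example
    for B in (4, 8, 12):
        print(B, min_bits_per_dimension(B, K=16, c_o=60.0, c_r=0.75, p0=1e-3))
```

Each additional bit $B$ per coefficient costs `overhead_factor(c_r)` additional bits per dimension; the constant offset is set by $c_o$ and $P_0$.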

IV. Quantization Universality and Signal Models 

A. Universality and Side Information 

One of the advantages of our approach is its universality, in the sense that we did not use any information 
on the signal model in designing the quantizer. This is a significant advantage of randomized sampling 
methods, such as Johnson-Lindenstrauss embedding and Compressive Sensing [13], [20], [41], [42]. 
Additional information about the signal can be exploited in the reconstruction to improve performance. 
The information available about the signal can take the form of a model on the signal structure, e.g., 
that the signal is sparse, or that it lies in a manifold [13], [43]-[47]. Alternatively, we might have prior 
knowledge of an existing signal that is very similar to the acquired one (e.g., see [48]). This information 
can be incorporated in the reconstruction to improve the reconstruction quality. It is expected that such 
information can allow us to provide stronger guarantees for the performance of our quantizer. 

We incorporate side information by modifying the set $\mathcal{S}$ of signals of interest. This set affects our 
performance through the number of $\epsilon$-balls required to cover it, known as the covering number of the set. 
In the development above, for $K$-dimensional signals with norm bounded by 1, covering can be achieved 
by $C_\epsilon = (3/\epsilon)^K$ balls. The results we developed, however, do not rely on any particular covering number 
expression. In general, for any set $\mathcal{S}$ the probability that quantization fails is bounded by 

\[ P(\exists\, \mathbf{x}, \mathbf{x}' \in \mathcal{S}, \|\mathbf{x}-\mathbf{x}'\|_2 > d \text{ s.t. } \mathbf{q} = \mathbf{q}') \le \left( C_\epsilon^{\mathcal{S}} \right)^2 (c_r)^M, \qquad \epsilon = \frac{3d}{c_o\sqrt{K}}, \]

where $C_\epsilon^{\mathcal{S}}$ denotes the covering number of the set of interest $\mathcal{S}$ as a function of the ball size $\epsilon$, and $c_o$, $c_r$ 
are as defined above. 

This observation allows us to quantize known classes of signals, such as sparse signals or signals in a 
union of subspaces. All we need for this characterization is an upper bound for the covering number of 
the set (or its logarithm, i.e., the Kolmogorov $\epsilon$-entropy of the set [49]). The underlying assumption is 
the same as above: that the reconstruction algorithm selects a signal in the set $\mathcal{S}$ that is consistent with 
the quantized measurements. 

The Kolmogorov $\epsilon$-entropy of a set provides a lower bound on the number of bits necessary to encode 
the set with worst-case distortion $\epsilon$ using vector quantization. To achieve this rate, we construct the 
$\epsilon$-covering of the set and use the available bits to enumerate the centers of the $\epsilon$-balls comprising the 
covering. Each signal is quantized to the closest $\epsilon$-ball center, the index of which is used to represent the 
signal. While the connection with vector quantization is well understood in the literature, the results in 
this paper provide, to our knowledge, the first example relating the Kolmogorov $\epsilon$-entropy of a set and 
the achievable performance under scalar quantization. Specifically, using a similar derivation to Cor. 3.4, 
the number of bits sufficient to guarantee worst-case distortion $d$ with probability greater than $1 - P_0$ is 

\[ M \ge \frac{2\log C_\epsilon^{\mathcal{S}} + \log(1/P_0)}{\log(1/c_r)}, \tag{12} \]

where $\log C_\epsilon^{\mathcal{S}}$ is the $\epsilon$-entropy for $\epsilon = 3d/(c_o\sqrt{K})$. Aside from constants, there is a $\sqrt{K}$ penalty 
over vector quantization in our approach, consistent with the findings in Sec. III-D. 
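The bit budget in (12) depends on the signal set only through its $\epsilon$-entropy. The following sketch assumes (12) has the form $M \ge (2\log C + \log(1/P_0))/\log(1/c_r)$, which is what requiring $C^2 (c_r)^M \le P_0$ gives; the helper names and the numerical values of $c_o$, $c_r$, and $P_0$ are illustrative assumptions.

```python
import math

def bits_from_entropy(log_cover, c_r, p0):
    """Measurements (bits) sufficient when C^2 * c_r^M <= p0, with
    log_cover = log C the natural-log covering number of the signal set."""
    return (2.0 * log_cover + math.log(1.0 / p0)) / math.log(1.0 / c_r)

def unit_ball_log_cover(K, eps):
    """log of the (3/eps)^K bound on the covering number of the unit ball."""
    return K * math.log(3.0 / eps)

if __name__ == "__main__":
    K, d, c_o, c_r, p0 = 16, 0.1, 60.0, 0.75, 1e-3
    eps = 3.0 * d / (c_o * math.sqrt(K))      # ball size used in (12)
    print(bits_from_entropy(unit_ball_log_cover(K, eps), c_r, p0))
```

Any side information that shrinks the covering number, such as sparsity or a known similar signal, lowers `log_cover` and therefore the number of measurements.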

In the remainder of this section we examine three special cases: Compressive Sensing, signals in a 
union of subspaces, and signals with a known similar signal as side information. 



B. Quantization of Sparse Signals 

Compressive Sensing, one of the recent developments in signal acquisition technology, assumes that 
the acquired signal $\mathbf{x}$ contains few non-zero coefficients, i.e., is sparse, when expressed in some basis. 
This assumption significantly reduces the number of measurements required for acquisition and exact 
reconstruction [13], [17], [19], [20]. However, when combined with scalar quantization it can be shown 
that CS measurements are quite inefficient in terms of their rate-distortion trade-off [25]. The cause 
is essentially the same as the cause for the inefficiency of oversampling in the case of non-sparse 
signals: sparse signals occupy a small number of subspaces in the measurement space. Thus, they do not 
intersect most of the available quantization points. The proposed quantization scheme has the potential 
to significantly improve the rate-distortion performance of CS. 

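Compressive Sensing examines $K$-sparse signals in an $N$-dimensional space. Thus the signal acquired 
contains up to $K$ non-zero coefficients and, therefore, lies in a $K$-dimensional subspace out of the $\binom{N}{K}$ 
such subspaces. Since each of the subspaces can be covered with $(3/\epsilon)^K$ balls, and picking $\sigma = \frac{1}{\sqrt{N}}$, $\epsilon = \frac{r_1\Delta}{2c_p}$, 
and $\Delta = \frac{r_2 d}{\sqrt{N}}$, the probabilistic guarantee of reconstruction becomes 

\[ \begin{aligned} P(\exists\, \mathbf{x}, \mathbf{x}' \in \mathcal{S}, \|\mathbf{x}-\mathbf{x}'\|_2 > d \text{ s.t. } \mathbf{q} = \mathbf{q}') &\le \binom{N}{K}^2 \left(\frac{c_o\sqrt{N}}{d}\right)^{2K} (c_r)^M \\ &\le e^{2K\log\left(\frac{e\, c_o N^{3/2}}{Kd}\right) - M\log(1/c_r)}, \end{aligned} \]

which decays exponentially with $M$, as long as $M = \Omega(K\log N - K\log(Kd)) = \Omega(K\log(N/(Kd)))$, 
similar to most Compressive Sensing results. The difference here is that there is an explicit rate-distortion 
guarantee since $M$ represents both the number of measurements and the number of bits used. 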
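The scaling of $M$ with $K$, $N$, and $d$ can be explored numerically. This is a rough sketch under explicit assumptions: the failure bound is taken to be $\binom{N}{K}^2 (c_o\sqrt{N}/d)^{2K} (c_r)^M$, i.e., the parameter choices scale with the ambient dimension $N$, and the binomial coefficient is relaxed with the standard bound $\binom{N}{K} \le (eN/K)^K$. The function names and the constants $c_o = 60$, $c_r = 3/4$ are illustrative.

```python
import math

def cs_log_failure_bound(N, K, M, d, c_o, c_r):
    """Natural log of binom(N, K)^2 * (c_o*sqrt(N)/d)^(2K) * c_r^M."""
    log_binom = math.lgamma(N + 1) - math.lgamma(K + 1) - math.lgamma(N - K + 1)
    return (2.0 * log_binom
            + 2.0 * K * math.log(c_o * math.sqrt(N) / d)
            + M * math.log(c_r))

def cs_sufficient_measurements(N, K, d, c_o, c_r, p0):
    """Smallest integer M for which the relaxed bound
    exp(2K log(e*c_o*N^1.5/(K*d)) - M log(1/c_r)) is at most p0."""
    log_lead = 2.0 * K * math.log(math.e * c_o * N ** 1.5 / (K * d))
    return math.ceil((log_lead + math.log(1.0 / p0)) / math.log(1.0 / c_r))

if __name__ == "__main__":
    for N in (10**3, 10**4, 10**5):
        print(N, cs_sufficient_measurements(N, K=10, d=0.1, c_o=60.0, c_r=0.75, p0=1e-3))
```

Because every measurement is a single bit, the returned value is simultaneously a measurement count and a bit budget.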

C. Quantization of Signals in a Union of Subspaces 



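A more general model is signals in a finite union of subspaces [43]-[46]. Under this model, the signal 
being acquired belongs to one of $L$ $K$-dimensional subspaces. In this case the reconstruction guarantee 
becomes 

\[ \begin{aligned} P(\exists\, \mathbf{x}, \mathbf{x}' \in \mathcal{S}, \|\mathbf{x}-\mathbf{x}'\|_2 > d \text{ s.t. } \mathbf{q} = \mathbf{q}') &\le L^2 \left(\frac{c_o\sqrt{N}}{d}\right)^{2K} (c_r)^M \\ &\le e^{2\log L + 2K\log\left(\frac{c_o\sqrt{N}}{d}\right) - M\log(1/c_r)}, \end{aligned} \]

which decays exponentially with $M$, as long as $M = \Omega(\log L + K\log(N/d))$. Compressive Sensing is 
a special case of signals in a union of subspaces, where $L = \binom{N}{K}$. 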

This result is in contrast with the analysis on unquantized measurements for signals in a union of 
subspaces [43]-[46]. Specifically, these results demonstrate no dependence on $N$, the size of the ambient 
signal space; $O(\log L + K)$ unquantized measurements are sufficient to robustly reconstruct signals from 
a union of subspaces. On the other hand, using an analysis similar to [2], [25] it is straightforward to show 
that increasing the rate by increasing the number of measurements provides only a linear reduction of the 
error as a function of the number of measurements, similar to the behavior described by (6). Alternatively, 
we can consider the Kolmogorov $\epsilon$-entropy, i.e., the minimum number of bits necessary to represent the 
signal set at distortion $\epsilon$, without requiring robustness or imposing linear measurements. This is exactly 
equal to $\log_2(C_\epsilon^{\mathcal{S}})$ and suggests that $O(\log L + K)$ bits are required. Whether the logarithmic dependence 
on $N$ exhibited by our approach is fundamental, due to the requirement for linear measurements, or 
whether it can be removed by different analysis is an interesting question for further research. 



D. Quantization of Similar Signals 

Quite often, the side information is a known signal $\mathbf{x}_s$ that is very similar to the acquired signal. 
For example, in video applications one frame might be very similar to the next; in multispectral image 
acquisition and compression the acquired signal in one spectral band is very similar to the acquired signal 
in another spectral band [48]. In such cases, knowledge of $\mathbf{x}_s$ can significantly reduce the number of 
quantized measurements required to acquire the new signal. 

As an example, consider the case where it is known that the acquired signal $\mathbf{x}$ differs from the side 
information $\mathbf{x}_s$ by at most $D \ge \|\mathbf{x}_s - \mathbf{x}\|_2$. Thus the acquired signal exists in the $D$-ball around $\mathbf{x}_s$, 
$B_D(\mathbf{x}_s)$. Using the same argument as above, we can construct a covering of a $D$-ball using $(3D/\epsilon)^K$ 
$\epsilon$-balls. Thus, the distortion guarantee becomes 

\[ P(\exists\, \mathbf{x}, \mathbf{x}' \in \mathcal{S}, \|\mathbf{x}-\mathbf{x}'\|_2 > d \text{ s.t. } \mathbf{q} = \mathbf{q}') \le \left(\frac{c_o D\sqrt{K}}{d}\right)^{2K} (c_r)^M. \]

If we fix the probability that we fail to guarantee reconstruction performance to $P_0$, as with Cor. 3.4, the 
distortion guarantee we can provide decreases linearly with $D$: 

\[ \|\mathbf{x}-\mathbf{x}'\|_2 > \frac{c_o D\sqrt{K}}{P_0^{1/(2K)}} (c_r)^{\frac{M}{2K}} \;\Longrightarrow\; \mathbf{q} \ne \mathbf{q}'. \]
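The effect of side information on the guarantee can be tabulated directly. The sketch below assumes the guarantee takes the form $d = c_o D\sqrt{K}\,(c_r)^{M/(2K)} / P_0^{1/(2K)}$, obtained by setting $(c_o D\sqrt{K}/d)^{2K}(c_r)^M = P_0$; the function name and the numerical constants are illustrative assumptions.

```python
import math

def distortion_guarantee(M, K, D, c_o, c_r, p0):
    """Distance d beyond which two signals in the D-ball are inconsistent
    with probability at least 1 - p0, given M one-bit measurements."""
    return (c_o * D * math.sqrt(K)
            * c_r ** (M / (2.0 * K))
            / p0 ** (1.0 / (2.0 * K)))

if __name__ == "__main__":
    K, c_o, c_r, p0 = 16, 60.0, 0.75, 1e-3
    for D in (1.0, 0.1, 0.01):
        print(D, distortion_guarantee(M=2000, K=K, D=D, c_o=c_o, c_r=c_r, p0=p0))
```

Shrinking $D$ by a factor of ten shrinks the guaranteed distortion by the same factor, while every additional $2K$ measurements multiply it by $c_r$.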

V. Discussion and Open Questions 

This paper demonstrates universal scalar quantization with exponential decay of the quantization 
error as a function of the oversampling rate (and, consequently, of the bit rate). This allows rate-efficient 
quantization for oversampled signals without any need for methods requiring feedback or joint 
quantization of coefficients, such as Sigma-Delta or vector quantization. The framework we develop is 
universal and can incorporate side information on the signal, when available. Our development establishes 
a direct connection between the Kolmogorov $\epsilon$-entropy of the measured signals and the achievable rate 
vs. distortion performance under scalar quantization. 

The fundamental realization to enable this performance is that continuous quantization regions (i.e., 
monotonic scalar quantization functions) cause the inherent limitation of scalar quantizers. Using 
non-continuous quantization regions we make more effective use of the quantization bits. While in this 
paper we only analyze binary quantization, it is straightforward to analyze multibit quantizers, shown 
in Fig. 1(c). The only difference is the probability $P(q=q'|l)$ that two arbitrary signals produce a 
consistent measurement in (8) and Fig. 3(b). The modified function should be equal to zero in the 
intervals $[(2^B i + 1)\Delta, (2^B(i+1) - 1)\Delta]$, $i = 0, 1, \ldots$, and equal to (8) everywhere else. The remaining 
derivation is identical to the one we presented. We conjecture that careful analysis of the multibit case 
should present an exponential decay constant $c_r > 1/2^B$, which can approach that lower bound arbitrarily 
closely. 

One of the issues not addressed in this work is practical reconstruction algorithms. Reconstruction 
from the proposed sampling scheme is indeed not straightforward. However, we believe that our work 
opens the road to a variety of scalar quantization approaches which can exhibit practical and efficient 
reconstruction algorithms. One approach is to use the results in this paper hierarchically, with a different 
scaling parameter A at each hierarchy level, and, therefore, different reconstruction accuracy guarantees. 
The parameters can be designed such that the reconstruction problem at each level is a convex problem, 
therefore tractable. This approach is explored in more detail in [50]. We defer discussion of other practical 
reconstruction approaches to future work. 

A difficulty in implementing the proposed approach is that the precision parameter $\Delta$ is tightly related 
to the hardware implementation of the quantizer. It is also critical to the performance. If the hardware is 
not precise enough to scale $\Delta$ and produce a fine enough quantization function $Q(x)$, then the asymptotic 
performance of the quantizer degrades. This is generally not an issue in software implementations, e.g., 
in compression applications, assuming we do not reach the limits of machine precision. 

The precision parameter $\Delta$ also has to be designed in advance to accommodate the target accuracy. 
This might be undesirable if the required accuracy of the acquisition system is not known in advance, 
and we hope to decide the number of measurements during the system's operation, maybe after a certain 
number of measurements has already been acquired with a lower precision setting. One approach to 
address this issue is to hierarchically scale the precision parameter, such that the measurements are more 
and more refined as more are acquired. The hierarchical quantization discussed in [50] implements this 
approach. 

Another topic worthy of further research is performance in the presence of noise. Noise can create 
several problems, such as incorrect quantization bits. Even with infinite quantization precision, noise is 
an inescapable fact of signal acquisition and degrades performance. There are several ways to account 
for noise in this work. One possibility is to limit the size of the precision parameter $\Delta$ such that the 
probability the noise causes the measurement to move by more than $\Delta$ can be safely ignored. This will 
limit the number of bit flips due to noise, and should provide some performance guarantee. It will also 
limit the asymptotic performance of the quantizer. Another possibility is to explore the robust embedding 
properties of the acquisition process, similar to [33]. More precise examination is an open question, also 
for future work. 

An interesting question is the "democratic" property of this quantizer, i.e., how well the information is 
distributed to each quantization bit [23], [51], [52]. This is a desirable property since it provides robustness 
to erasures, something that overcomplete representations are known for [36], [38]. Superficially it seems 
that the quantizer is indeed democratic. In a probabilistic sense, all the measurements contain the same 
amount of information. Similarities with democratic properties in Compressive Sensing [52] hint that the 
democratic property of our method should be true in an adversarial sense as well. However, we have not 
attempted a proof in this paper. 

Last, we should note that this quantization approach has very tight connections with locality-sensitive 
hashing (LSH) and $\ell_2$ embeddings under the Hamming distance (e.g., see [53] and references within). 
Specifically, our quantization approach effectively constructs such an embedding, some of the properties 
of which are examined in [54], although not in the same language. A significant difference is in the 
objective. Our goal is to enable reconstruction, whereas the goal of LSH and randomized embeddings is 
to approximately preserve distances with very high probability. A rigorous treatment of the connections 
of quantization and LSH is quite interesting and deserves a publication of its own. A preliminary attempt 
to view LSH as a quantization problem is performed in [55]. 